Abstract. In this paper, we establish a weak law of large numbers for the total internal length
of the Bolthausen-Sznitman coalescent. As a consequence, we obtain the weak limit law
of the centered and rescaled total external length. The latter extends results obtained by
Dhersin & Möhle. An application to population genetics dealing with the total number
of mutations in the genealogical tree is also given.



1. Introduction and main results 

In population genetics, one way to explain disparity is to observe how many genes appear
only once in the sample. A gene carried by a single individual is the result of one of two possible
events: either the gene comes from a mutation that appeared in an external branch of the
genealogical tree, or the gene is of the ancestral type and mutations occurred in the rest
of the sample (see Figure 1). We suppose that events of the second type occur much less
frequently than events of the first type (this is indeed the case when the size of the sample
is large). The total number of genes carried by a single individual is then closely related to
the so-called total external length, which is the sum of all external branch lengths of the tree.



Figure 1. In this genealogical tree, two mutations appear. Mutation 1 is
in an internal branch and it is shared by 4 individuals. Mutation 2 is in an
external branch so it is carried by 1 individual. In this example, an ancestral
gene is also carried by 1 individual. This situation is negligible when the size
of the sample is large.

The Bolthausen-Sznitman coalescent (see for instance [6]) is a well-known example of
exchangeable coalescents with multiple collisions (see [16, 17] for a proper definition of this
type of coalescent). It was first introduced in physics in order to study spin glasses, but it
has also been considered as a limiting genealogical model for evolving populations with selective
killing at each generation, see for instance [HIS]. Recently, Berestycki et al. [3] noted that
this coalescent represents the genealogies of branching Brownian motion with absorption.

The Bolthausen-Sznitman coalescent $(\Pi_t, t \ge 0)$ is a continuous-time Markov chain with
values in the set of partitions of $\mathbb{N}$, starting with an infinite number of blocks/individuals. In
order to give a formal description of the Bolthausen-Sznitman coalescent, it is sufficient to give
its jump rates. Let $n \in \mathbb{N}$; then the restriction $(\Pi_t^{(n)}, t \ge 0)$ of $(\Pi_t, t \ge 0)$ to $[n] = \{1, \dots, n\}$
is a Markov chain with values in $\mathcal{P}_n$, the set of partitions of $[n]$, with the following dynamics:
whenever $\Pi_t^{(n)}$ is a partition consisting of $b$ blocks, any particular $k$ of them merge into one
block at rate

\[ \lambda_{b,k} = \frac{(k-2)!\,(b-k)!}{(b-1)!}, \qquad 2 \le k \le b, \]

so the next coalescence event occurs at rate

\[ \lambda_b = \sum_{k=2}^{b} \binom{b}{k} \lambda_{b,k} = b - 1. \tag{1} \]

Note that mergers of several blocks into a single block are possible, but multiple mergers do
not occur simultaneously. Moreover, this coalescent process is exchangeable, i.e. its law does
not change under the effect of a random permutation of the labels of its blocks.
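The jump rates can be checked numerically. The following is a minimal sketch, assuming the standard Bolthausen-Sznitman rate $(k-2)!\,(b-k)!/(b-1)!$ for a particular group of $k$ blocks out of $b$; the function names `merger_rate` and `total_rate` are illustrative and do not come from the paper. Exact rational arithmetic is used so that the identity "total rate equals $b-1$" is verified without rounding.

```python
from fractions import Fraction
from math import comb, factorial


def merger_rate(b: int, k: int) -> Fraction:
    """Rate at which one particular group of k blocks (out of b) merges."""
    if not 2 <= k <= b:
        raise ValueError("need 2 <= k <= b")
    return Fraction(factorial(k - 2) * factorial(b - k), factorial(b - 1))


def total_rate(b: int) -> Fraction:
    """Total coalescence rate: (number of k-subsets) * rate, summed over k."""
    return sum(
        (comb(b, k) * merger_rate(b, k) for k in range(2, b + 1)),
        Fraction(0),
    )


if __name__ == "__main__":
    for b in (2, 5, 10):
        # Each printed total rate should equal b - 1.
        print(b, total_rate(b))
```

The check works because $\binom{b}{k}\lambda_{b,k} = b/(k(k-1))$, a telescoping sum over $k$.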

One of our aims is to study the total external length $E^{(n)}$ of the Bolthausen-Sznitman coalescent.
More precisely, we determine the asymptotic behaviour of the total external length of the
Bolthausen-Sznitman coalescent restricted to $\mathcal{P}_n$, when $n$ goes to infinity, and relate it to its
total length $L^{(n)}$ (the sum of lengths of all external and internal branches). A first orientation
can be gained from coalescents without proper frequencies. For this class Möhle [15] proved
that after a suitable scaling the asymptotic distributions of $E^{(n)}$ and $L^{(n)}$ are equal. Now
the Bolthausen-Sznitman coalescent does not belong to this class, but it is (loosely speaking)
located at the borderline. Also it is known for the Bolthausen-Sznitman coalescent [7] that

\[ \frac{(\log n)^2}{n} L^{(n)} - \log n - \log\log n \xrightarrow[n\to\infty]{d} Z, \tag{2} \]

where $\xrightarrow{d}$ denotes convergence in distribution and $Z$ is a strictly stable r.v. with index 1, i.e.
its characteristic exponent satisfies

\[ \Psi(\theta) = -\log \mathbb{E}\big[e^{i\theta Z}\big] = \frac{\pi}{2}|\theta| - i\theta\log|\theta|, \qquad \theta \in \mathbb{R}. \]

In their recent work, Dhersin and Möhle [9] showed that the ratio $E^{(n)}/L^{(n)}$ converges to 1
in probability. Thus one might guess that $E^{(n)}$ satisfies the same asymptotic relation with
the same scaling. It is a main result of this paper that such a conjecture is almost, but not
completely, true.

Let us consider $(\Pi_t^{(n)}, t \ge 0)$. We denote by $U_k^{(n)}$ the size of the $k$-th jump, i.e. the
number of blocks that the Markov chain loses in the $k$-th coalescence event. We also denote
by $X_k^{(n)}$ the number of blocks after $k$ coalescence events. Observe that $X_0^{(n)} = n$ and

\[ X_k^{(n)} = X_{k-1}^{(n)} - U_k^{(n)} = n - \sum_{i=1}^{k} U_i^{(n)}. \]

Since the merging blocks coalesce into one, there are $U_k^{(n)} + 1$ blocks involved in the $k$-th
coalescence event and, for $l < b$,

\[ \mathbb{P}\big(U_k^{(n)} = l \,\big|\, X_{k-1}^{(n)} = b\big) = \frac{b}{b-1}\,\frac{1}{l(l+1)}. \]

Let $\tau^{(n)}$ be the number of coalescence events. More precisely,

\[ \tau^{(n)} = \inf\{k : X_k^{(n)} = 1\}. \]

According to Iksanov and Möhle [12] (see also [8]), $\tau^{(n)}$ satisfies the following asymptotic
behaviour:

\[ \frac{(\log n)^2}{n} \tau^{(n)} - \log n - \log\log n \xrightarrow[n\to\infty]{d} Z. \tag{3} \]
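The block-counting chain described above is easy to simulate. The sketch below assumes the jump law $\mathbb{P}(U = l \mid X = b) = \frac{b}{b-1}\frac{1}{l(l+1)}$ for $1 \le l \le b-1$; the helper names `jump_pmf`, `sample_jump` and `block_counting_chain` are hypothetical. It is meant only as an illustration of the dynamics, not as an efficient sampler (each step costs $O(b)$).

```python
import random
from fractions import Fraction


def jump_pmf(b: int) -> list[Fraction]:
    """P(U = l | X = b) for l = 1, ..., b - 1, as exact fractions."""
    return [Fraction(b, (b - 1) * l * (l + 1)) for l in range(1, b)]


def sample_jump(b: int, rng: random.Random) -> int:
    """Draw the number of blocks lost when b >= 2 blocks are present."""
    weights = [float(p) for p in jump_pmf(b)]
    return rng.choices(range(1, b), weights=weights)[0]


def block_counting_chain(n: int, rng: random.Random) -> list[int]:
    """Return the path [X_0, X_1, ..., X_tau] with X_0 = n and X_tau = 1."""
    path = [n]
    while path[-1] > 1:
        path.append(path[-1] - sample_jump(path[-1], rng))
    return path


if __name__ == "__main__":
    rng = random.Random(2024)
    path = block_counting_chain(1000, rng)
    # The number of coalescence events tau is the number of steps taken.
    print("tau =", len(path) - 1)
```

The number of coalescence events is `len(path) - 1`; by (3) it is of order $n/\log n$ for large $n$.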

The main result of this paper describes the behaviour of the total internal length $I^{(n)}$
when $n$ goes to $\infty$. In order to do so, we introduce the r.v. $Y_k^{(n)}$ that represents the number
of internal branches after $k$ coalescence events. In other words, it is the number of remaining
blocks which have already participated in a coalescence event. Note that at time 0 all branches
are external, i.e. $Y_0^{(n)} = 0$. Let $(e_k, k \ge 1)$ be a sequence of i.i.d. standard exponential r.v.,
also independent of the $X_k^{(n)}$ and $Y_k^{(n)}$; thus, from (1), we have

\[ I^{(n)} \overset{d}{=} \sum_{k=1}^{\tau^{(n)}-1} Y_k^{(n)} \frac{e_k}{X_k^{(n)} - 1}. \]

Our main result is the following weak law of large numbers for $I^{(n)}$. Here $\xrightarrow{P}$ denotes
convergence in probability.

Theorem 1.1. For the total internal length of the Bolthausen-Sznitman coalescent, we have

\[ \frac{(\log n)^2}{n} I^{(n)} \xrightarrow[n\to\infty]{P} 1. \]



Now noting that $L^{(n)} = I^{(n)} + E^{(n)}$ and using (2) and our main result, we deduce the
asymptotic distribution of the total external length $E^{(n)}$.

Corollary 1.2. For the total external length of the Bolthausen-Sznitman coalescent, we have

\[ \frac{(\log n)^2}{n} E^{(n)} - \log n - \log\log n \xrightarrow[n\to\infty]{d} Z - 1. \]
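To get a feel for the internal and external lengths, one can simulate a single tree. The sketch below is an illustration under stated assumptions: with $b$ blocks the holding time is exponential with rate $b-1$, the jump size has law proportional to $1/(l(l+1))$ on $\{1,\dots,b-1\}$, and, by exchangeability, the merging blocks form a uniformly chosen subset of the current blocks. The function name `simulate_lengths` is hypothetical.

```python
import random


def simulate_lengths(n: int, rng: random.Random) -> tuple[float, float]:
    """Return (internal_length, external_length) of one simulated tree."""
    blocks, external = n, n            # X_k and the number of external branches
    internal_len = external_len = 0.0
    while blocks > 1:
        hold = rng.expovariate(blocks - 1)        # total rate is b - 1
        external_len += external * hold
        internal_len += (blocks - external) * hold
        # Jump size: P(U = l) proportional to 1 / (l (l + 1)), 1 <= l <= b - 1.
        weights = [1.0 / (l * (l + 1)) for l in range(1, blocks)]
        lost = rng.choices(range(1, blocks), weights=weights)[0]
        # Label the external blocks 0, ..., external - 1; the lost + 1 merging
        # blocks are a uniform subset, so the number of external ones among
        # them is hypergeometric.
        merging = rng.sample(range(blocks), lost + 1)
        external -= sum(1 for i in merging if i < external)
        blocks -= lost
    return internal_len, external_len


if __name__ == "__main__":
    rng = random.Random(7)
    n = 2000
    i_len, e_len = simulate_lengths(n, rng)
    print("internal:", i_len, "external:", e_len, "total:", i_len + e_len)
```

For moderate $n$ the convergence in Theorem 1.1 is slow because of the logarithmic scaling, so a single run should be read as a qualitative illustration only.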

Observe that the Bolthausen-Sznitman coalescent can be seen as a special case ($\alpha = 1$) of
the so-called Beta$(2-\alpha, \alpha)$-coalescents, a class defined for $0 < \alpha < 2$. Möhle's work [15]
shows that in the case $0 < \alpha < 1$ the variable $E^{(n)}/n$ converges in law to a random variable
defined in terms of a driftless subordinator depending on $\alpha$. For $1 < \alpha < 2$, we refer to [13],
where it is proven that $(E^{(n)} - c\,n^{2-\alpha})/n^{1/\alpha + 1 - \alpha}$ converges weakly to a stable r.v. of index
$\alpha$, $c$ being a constant also depending on $\alpha$ (see also [21 [H |T0]). In Kingman's case ($\alpha \to 2$) a
logarithmic correction appears and the limit law is normal (see [13]).

The remainder of the paper is structured as follows. In Section 2, we prove our main results
using a coupling method which was introduced in [12] and which provides more information on the
chain $X^{(n)} = (X_k^{(n)}, k \ge 0)$. Finally, Section 3 is devoted to the asymptotic behaviour of the
number of mutations appearing in external and internal branches of the Bolthausen-Sznitman
coalescent.

2. Proofs

2.1. A coupling. In this section, we use the coupling method introduced in [12] in order to
study the number of jumps $\tau^{(n)}$.

Let $(V_i)_{i \ge 1}$ be a sequence of i.i.d. random variables with distribution

\[ \mathbb{P}(V_i = k) = \frac{1}{k(k+1)}, \qquad k \ge 1. \tag{4} \]

Note that $\mathbb{P}(V_i \ge k) = 1/k$. Let $S_n = V_1 + \cdots + V_n$. It is well known that

\[ \frac{S_n - n\log n}{n} \xrightarrow[n\to\infty]{d} Z, \tag{5} \]

where $Z$ is the stable random variable that appears in (2). We have the following functional
limit result, whose limit is necessarily a Lévy process.

Lemma 2.1. The process $(L_n(t), 0 \le t \le 1)$ defined by

\[ L_n(t) = \frac{S_{\lfloor nt \rfloor} - \lfloor nt \rfloor \log n}{n} \]

converges weakly in the Skorohod space $\mathcal{D}[0, 1]$.



Proof. We first verify that the convergence of finite-dimensional distributions holds. Let
$t > 0$; from (5) we deduce

\[ L_n(t) = \frac{S_{\lfloor nt \rfloor} - \lfloor nt \rfloor \log(nt) + \lfloor nt \rfloor \log t}{n} \xrightarrow[n\to\infty]{d} tZ + t\log t. \]

Similarly, if we take $s < t$, then

\[ \big(L_n(s),\, L_n(t) - L_n(s)\big) \xrightarrow[n\to\infty]{d} (Z_1, Z_2), \]

where $Z_1$ and $Z_2$ are independent random variables distributed as $sZ + s\log s$ and
$(t-s)Z + (t-s)\log(t-s)$, respectively. The mapping theorem implies that

\[ \big(L_n(s),\, L_n(t)\big) \xrightarrow[n\to\infty]{d} (Z_1, Z_1 + Z_2). \]

A set of three or more time points can be treated in the same way, and hence the
finite-dimensional distributions converge properly.

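We now check tightness via Aldous' criterion. Let $\tau_n$ be an $L_n$-stopping time and $(\theta_n)$ a
sequence of positive numbers such that $\theta_n \to 0$ as $n$ increases. Then for $\varepsilon > 0$, we have

\[ \mathbb{P}\Big(\big|L_n(\tau_n + \theta_n) - L_n(\tau_n)\big| > \varepsilon\Big) \le \mathbb{P}\bigg(\frac{\big|S_{\lfloor n\theta_n \rfloor} - \lfloor n\theta_n \rfloor \log(n\theta_n)\big|}{n} + \frac{\lfloor n\theta_n \rfloor\,|\log\theta_n|}{n} > \varepsilon\bigg), \]

which converges to 0. This completes the proof. □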
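The random walk used in Lemma 2.1 is simple to sample. The sketch below assumes the step law $\mathbb{P}(V = k) = 1/(k(k+1))$, equivalently $\mathbb{P}(V \ge k) = 1/k$, and uses the inverse-transform observation that $\lfloor 1/U \rfloor$ has this tail when $U$ is uniform on $(0, 1]$. The names `pmf`, `sample_v` and `rescaled_walk` are illustrative.

```python
import math
import random
from fractions import Fraction


def pmf(k: int) -> Fraction:
    """P(V = k) = 1 / (k (k + 1)) for k >= 1."""
    return Fraction(1, k * (k + 1))


def sample_v(rng: random.Random) -> int:
    """Inverse transform: floor(1/U) satisfies P(V >= k) = P(U <= 1/k) = 1/k."""
    u = 1.0 - rng.random()             # uniform on (0, 1]
    return math.floor(1.0 / u)


def rescaled_walk(n: int, t: float, rng: random.Random) -> float:
    """One sample of L_n(t) = (S_m - m log n) / n with m = floor(n t)."""
    m = math.floor(n * t)
    s = sum(sample_v(rng) for _ in range(m))
    return (s - m * math.log(n)) / n


if __name__ == "__main__":
    rng = random.Random(11)
    print([round(rescaled_walk(10_000, t, rng), 3) for t in (0.25, 0.5, 1.0)])
```

Each call to `rescaled_walk` draws a fresh walk, so the printed values are independent samples of the marginals of $L_n$, not one path evaluated at several times.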



In what follows, we use the following notation. For a stochastic process $(Z_n, n \ge 0)$ and a
function $c(n)$, write $Z_n = O_p(c(n))$ as $n \to \infty$ if $Z_n/c(n)$ is stochastically bounded as $n \to \infty$,
i.e. if $\lim_{x\to\infty} \limsup_{n\to\infty} \mathbb{P}(|Z_n| > x\,c(n)) = 0$. We also write $Z_n = o_p(c(n))$ as $n \to \infty$ if
$Z_n/c(n)$ goes to 0 in probability.

From the above result, we deduce

\[ \sup_{1 \le k \le n} \big|S_k - k\log n\big| = O_p(n). \tag{6} \]



We now define recursively $(\rho(k))_{k \ge 0}$, a sequence of stopping times such that $\rho(0) = 0$ and

\[ \rho(k+1) = \inf\Big\{ i > \rho(k) : V_i + \sum_{j=1}^{k} V_{\rho(j)} < n \Big\}, \]

with the convention $\inf\emptyset = \infty$. In other words, the sequence $(\rho(k))_{k \ge 1}$ is the collection of
indices of the r.v.'s $V_i$ such that their sum does not exceed $n - 1$. It is proved in [12] that
$\tau^{(n)}$ and $\sup\{k : \rho(k) < \infty\}$ are equal in law, and that the terms of the block-counting Markov
chain of the Bolthausen-Sznitman coalescent can be represented as $X_0^{(n)} = n$ and

\[ X_k^{(n)} = n - \sum_{i=1}^{k} V_{\rho(i)}. \]
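The coupling can be sketched in a few lines: draw the $V_i$ one at a time and accept $V_i$ only if adding it to the sum of the previously accepted values keeps the total below $n$. The accepted indices play the role of $\rho(1), \rho(2), \dots$ and the running total gives the block-counting chain. The code assumes $\mathbb{P}(V \ge k) = 1/k$ and samples $V$ as $\lfloor 1/U \rfloor$; the function name `coupled_chain` is hypothetical.

```python
import math
import random


def coupled_chain(n: int, rng: random.Random) -> tuple[list[int], list[int]]:
    """Return (rho, path): accepted indices and the chain X_0, ..., X_tau."""
    rho: list[int] = []
    path = [n]
    used = 0                    # sum of the accepted V's so far
    i = 0
    while path[-1] > 1:
        i += 1
        v = math.floor(1.0 / (1.0 - rng.random()))   # P(V >= k) = 1/k
        if v + used < n:        # acceptance rule defining rho(k + 1)
            used += v
            rho.append(i)
            path.append(n - used)
    return rho, path


if __name__ == "__main__":
    rng = random.Random(5)
    rho, path = coupled_chain(1000, rng)
    # sigma: first k with rho(k) > k, i.e. the first time a draw was rejected.
    sigma = next((k for k, r in enumerate(rho, start=1) if r > k), None)
    print("tau =", len(rho), "sigma =", sigma)
```

Rejecting a draw amounts to conditioning $V$ on being smaller than the current number of blocks $b$, which gives the jump law $\frac{b}{b-1}\frac{1}{l(l+1)}$. Up to the first rejection the chain coincides with $n - S_k$, which is why the random walk can replace the chain for most of its lifetime.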



Next, we define

\[ \sigma^{(n)} = \inf\{k : \rho(k) > k\}, \]

the first time that the random walk meets or exceeds $n$, and

\[ \theta_\gamma^{(n)} = \tau^{(n)} - \frac{n}{(\log n)^{1+\gamma}}, \qquad \gamma \in (0, \infty], \tag{7} \]

with the convention $\theta_\infty^{(n)} = \tau^{(n)}$. Our first result allows us to consider the random walk
instead of the process of disappearing blocks until time $\theta_\gamma^{(n)}$.
Proposition 2.2. Let < 7 < 7' < 00. Then as n goes to 00, we have 



(8) 
Moreover, 



1 



and ii^X^S 
n 4 



1. 



(9) 



for n sufficiently large. 



sup 

i<fc<6»: 



(n) 
7 



X 



(n) 



log n 



Op(logn), 



In order to prove this proposition, we first show that a similar result holds for the family 
of stopping times 



c,7 



h)=infU,xf)< 



cn 



(logn)-^ 



where c is a positive constant, and then note that for e > 0, 

in) 



'/l-e,7 — ^^7 — '/l+€,7 



1. 



Hence, the proof of Proposition 2.2 relies on the following Lemma. 
Lemma 2.3. Let < 7 < 00. Then as n goes to 00, we have 



4:^ < 



1 



and ii^^X^S 
cn ??c,7 
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Moreover, 



sup 



X 



in) 



-(^) - k 



log n 



Op (log n), 



for n sufficiently large. 

Proof. First we prove the result for 7 < 1. Observe, from Q, that for any e > 

n \l-e \ Sn/logn 

(lognj ^ ^ 



Vfc < — — , for ah k < 2n/logn ) > ( 1 
(logn)i-^' - ' J -\ 



n 



Now, since P(r("^ < j^^) — )• 1, as n increases (which follows from ([s])), we get 



(10) 



For simplicity, we write ry^") instead of ?7c" • Then, it follows 



sup Vk = Op 

l<fe<T(") 

in) 



n 



(logn 



a 



(") < 



Vfc>^fc"\, for some A; < r/(")) 

for some k < r^"^ 



< p n4 > 



(log n) 



cn 



sup Vk > , . 
a<fc<r{") (logn)T 



thus if we take e G (0, 1 — 7) in (10), we deduce 



11 



0. 



On the event {a^^^ > ry^")}, it is clear that 



sup 

i<fc<'()(") 



X 



in) 



X 



in) 
k-1 



1 



sup 



l<fe<7;{") X^_-^ 



Vk (logn)T 
(n) cn 



1<A;<t(") 



Hence, from (10) and (11), we obtain 
(12) 



sup 

l<k<ri(") 



X 



in) 



X 



in) 
k-1 



Op{l). 



In particular, since < < ^^2)_i> we get 



(13) 

Next, we note 



(logn)T^(„) 



1. 



Xi^) _ (^{") _ ^) log n = xj^^ -n + klog(^ 
and from ([s]), it is clear 



2n 



logn 

^i^^T^") = logn + Op(log logn). 



(n — r*-"-* log n) + k log ^ 



log 



n 
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Then on the event < a^'^\r]^"'^ < j^}, it fohows from ^ that 

2n 



sup 

l<fc<i7(") 



xf)-(rW-A;)log 



n 



< sup 

l<fc<2n/logn 



Sk-k log 

n log log n 
logn 



logn 



n log log n 
logn 



Finally using ( 13 ) and the strong Markov property for = X^^^(„) , we deduce 



X 



(14) 

Then, putting all the pieces together, we get 



in) 

^ 1 + Op 1 = ^ 

yn) (logn) 



1+7 



(n) 



sup 

l<fe<r?{") 

and since 7 < 1, 
(15) 



, ,^ log n 



supi< 



< 



sup 

l<fc<r)(") 



^ - (t(") - A;) log 



n 



T-{n) _ ^{n) 



Op ((logn)''' log logn) 



X 



(n) 



-(«) - k 



log n 



Op (logn). 



Next, we will prove (11), (12), (13), (14) and (15) for any 7 > 0. We show that this claim 
holds for 7 < p/2 for any p E N, using induction on p. The proof for p = 1 has just been 
done. 

For the induction step suppose that the asymptotics in (11) to (15) hold for 7 < p/2. 
For simplicity, we write t)*-"^ = 'n^p^2' '^^^ idea is to use the strong Markov property at the 
stopping time ry^") and apply the above results for 7 < 1 to the Markov chain xj^'^^ = xj^^^„^ 
started at n = X^"i) (instead of n = -'^o"'^). Define the family of stopping times 

C(")=inf{fc,4")<-^|. 

[ " (logn)^/'^J 

Observe that C^"^ = j)^"^ + 7/|"] . Hence, using the strong Markov property at the stopping 
time r}*^") and the behaviour in (13), with 7 = 2/3, we get 

1. 



ivC")''^ ^(n) 



X 



(n) 



Then, from this asymptotic behaviour and the induction hypothesis taken in (13), 

(logn) 3 + 2 p ^ 
-^r(") ^ ^■ 

cn n— >-oo 



From this behaviour, from ( 13 ) and from 



Qog^)^ yin) ^ . . (log ny yin) 
cn ' cn 

we obtain for p/2 < 7 < (p + l)/2 



(16) 



-+ 1. 
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Now, on the event {a^"^ > ry^")}, using the strong Markov property at r}^"-' and (11) with 

(n) 
fji' 



the initial state Xl^l-^, we get 



-> 0. 



The induction hypothesis gives P((T^"'' > f}*^")) — )• 1, as n goes to oo. These two facts together 
lead to 



1, 



for7G (p/2,(p+l)/2]. 

From (12) and again the strong Markov property at f/("\ we get 



sup 

r)(")<A;<C(") 



X 



(n) 



X 



in) 
k-1 



Op(l). 



From the above behaviour, ( 16 ) and the induction hypothesis, we have 



sup 

l<fc<77(") 



X 



in) 



X 



in) 
k-1 



1 



for 7 G (^j/S, {p + l)/2]. Again the strong Markov property together with the above behaviour 
give us for all 7 G (p/2, {p + l)/2], 

cn 
(log n) 



_ ^in) 



From ( 15 ) and the strong Markov property, we get 



sup 

»5(")<A;<C(") 



X 



in) 



-in) - k 



in) 



(n) 



in) 

We know from the induction hypothesis that \ogX:,\ ~ logn, as n goes to 00. We then 



(n) 



obtain, from (16) and using again the induction hypothesis. 

Op (logn). 



sup 

1<A:<»?(") 



X 



in) 



-in) - k 



log n 



for all 7 G (p/2, (p + l)/2]. Hence the induction is complete and the behaviour in (11) to (15) 
hold for any 7 > 0. □ 

Proof of Proposition 2.2. We first recall the definition of 9^'^ in (7) and define r/^^ = f/i"^^ 
and r?^'' = From (14), it is clear 



1, 



and from (11), we deduce 



„in) > ^in) 



1. 



Thus the first asymptotic behaviour in (jSl) holds. Also note 



vin) ^ ^ yin) 

ri_ OX/ rj. 



then the second asymptotic behaviour in ([s]) follows from (13). 
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From (15), we get 



sup 

l<fc<6». 



(n) 
7 



X 



in) 



-in) 



logn 



Op (log n) 



which gives ^ for 7' = oo. Also 
k 



sup 

i<k<e. 



(n) 

7 



q(™) 



-(n) _ f. 



1 



sup 

l<fc<0. 



(n) 
7 



Jn) 



-(n) _ 



< 



q{") 



-(n) 



q(") 
■^7 



(log ra) 
(log n) 



7(l+Op(l))- 



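This gives us (9) and completes the proof. □

2.2. Proof of Theorem 1.1. We first define

\[ \tilde I^{(n)} = \sum_{k=1}^{\tau^{(n)}-1} \frac{Y_k^{(n)}}{X_k^{(n)}}, \]

which is obtained by replacing the exponential random variables $e_k$ by their mean and
approximating the denominator. Similarly, we define

\[ \hat I^{(n)} = \sum_{k=1}^{\tau^{(n)}-1} \frac{\mathbb{E}\big[Y_k^{(n)} \,\big|\, X^{(n)}\big]}{X_k^{(n)}}, \]

which is obtained by replacing the random variables $Y_k^{(n)}$ by their conditional expectations. This
new formulation is of interest. Indeed, similarly to [11], it is possible to determine
$\mathbb{E}[Y_k^{(n)} \mid X^{(n)}]$ via a recursive formula. Let $Z_k^{(n)}$ be the number of external branches after
$k$ jumps, $k \ge 1$, and take the conditional expectation of each $Z_k^{(n)}$ with respect to $X^{(n)}$ and
$Z_{k-1}^{(n)}$. Observe that $Z_{k-1}^{(n)} - Z_k^{(n)}$ is the number of external branches which participate in the
$k$-th coalescence event. Hence, this random variable is distributed as a hypergeometric r.v.
with parameters $X_{k-1}^{(n)}$, $Z_{k-1}^{(n)}$ and $1 + U_k^{(n)}$. It is then clear that

\[ \mathbb{E}\big[Z_{k-1}^{(n)} - Z_k^{(n)} \,\big|\, X^{(n)}, Z_{k-1}^{(n)}\big] = \big(1 + U_k^{(n)}\big)\frac{Z_{k-1}^{(n)}}{X_{k-1}^{(n)}}, \]

then

\[ \mathbb{E}\big[Z_k^{(n)} \,\big|\, X^{(n)}, Z_{k-1}^{(n)}\big] = Z_{k-1}^{(n)}\,\frac{X_k^{(n)} - 1}{X_{k-1}^{(n)}} \]

and

\[ \mathbb{E}\big[Z_k^{(n)} \,\big|\, X^{(n)}\big] = n \prod_{i=1}^{k} \frac{X_i^{(n)} - 1}{X_{i-1}^{(n)}}. \]

Finally, since $Y_k^{(n)} = X_k^{(n)} - Z_k^{(n)}$, it follows that

\[ \hat I^{(n)} = \sum_{k=1}^{\tau^{(n)}-1} \bigg(1 - \prod_{i=1}^{k}\Big(1 - \frac{1}{X_i^{(n)}}\Big)\bigg). \tag{17} \]

This last expression is a good way to understand the asymptotic behaviour of the total
internal length.

The following lemma provides the asymptotic behaviour of $\hat I^{(n)}$.

Lemma 2.4. As $n$ goes to $\infty$,

\[ \frac{(\log n)^2}{n}\,\hat I^{(n)} \to 1 \]

in probability.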



Proof. Let e > and take 0f ^ as in (|7|). We also let 6^' = [Ofjj and 0^"^ = and 
consider /^"^ as it is given in (17). We now split J*^"^ in two parts, as follows 



and 



, _„(n) \ i = l \ / / 



Note that 



which implies that 



(logn)2 



n 



< 



n 



+ 1, 



(logn)2+^ 
-)• 0, almost surely. 



(n) 

Then it is enough to study the behaviour of II . In order to do so, we first note 



k j-1 



i=l A . ^-^2 i=l i=l \ / i=l 

(This can be viewed as two Bonferroni inequalities for independent events with entrance 

probabilities 

On the one hand, 

k=l i=l i=l 



and thus 



E 

1=1 



1 



tlx — i sr-^ sr-^ 1 ' — i 



(n) 



From ([9|, we get 



fc=l i=l 



g(n)_ 



(n) 



logn 



k=i i=i ^ 



From the fact that 6'i''\ 6*^"^ ~ r^") ~ n/ log n, as n — )• oo, we deduce 



n 
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On the other hand by inverting the sums, we obtain 



n(n) 



-1 k 7-1 



g(n) 



1 



)(") J'-l 



Ey^ 1 ^ y^ y^ y^ ^ ^ y^ ^+ J y^ ^ 

A:=l i=2 i=l j=2 k=j 1=1 j=2 i=l 



(n) 



Using ^, we obtain 



y- y-y- 1 y- T^-j ^y^ i ^ i + opji) y- y- J_ 

fc=l j=2 i=l -^j -^j 0=2 i=l j=l i=l 



< 



n 



and finahy 

^+ -1 k j-1 ^ 

^ y{n) Yin) — (loErn)^ 

k=l j=2 i=l V 6 ; 

Putting all the pieces together give us 

(log^)^ fin) P , , 
-'i ^ J-) 



n 



which ends the proof. 



□ 



In order to prove Theorem 1.1 we just need to control our approximation. This is the aim
of the next two lemmas.

Lemma 2.5. As $n$ goes to $\infty$,

\[ I^{(n)} - \tilde I^{(n)} = O_p(\sqrt{n}). \]



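Proof. Recall that $X^{(n)}$ denotes the Markov chain $(X_k^{(n)}, k \ge 0)$. A simple computation gives
us

\[ I^{(n)} - \tilde I^{(n)} = \sum_{k=1}^{\tau^{(n)}-1} Y_k^{(n)}\,\frac{e_k - 1}{X_k^{(n)}} + \sum_{k=1}^{\tau^{(n)}-1} Y_k^{(n)}\,\frac{e_k}{X_k^{(n)}\big(X_k^{(n)} - 1\big)}. \]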



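Conditionally on $X^{(n)}$ and $Y^{(n)}$, the random variables $Y_k^{(n)}(e_k - 1)/X_k^{(n)}$ are independent with zero
mean. This implies

\[ \mathbb{E}\Bigg[\bigg(\sum_{k=1}^{\tau^{(n)}-1} Y_k^{(n)}\,\frac{e_k - 1}{X_k^{(n)}}\bigg)^{2} \,\Bigg|\, X^{(n)}, Y^{(n)}\Bigg] = \sum_{k=1}^{\tau^{(n)}-1} \bigg(\frac{Y_k^{(n)}}{X_k^{(n)}}\bigg)^{2} \le \tau^{(n)} - 1 \le n, \]

where the first inequality follows from the fact that $Y_k^{(n)} \le X_k^{(n)}$ a.s. Chebyshev's inequality
implies

\[ \sum_{k=1}^{\tau^{(n)}-1} Y_k^{(n)}\,\frac{e_k - 1}{X_k^{(n)}} = O_p(\sqrt{n}). \]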



Again using that Y^^^ < xjj^^ a.s., we get 



r("'-l 

E n 



(n) 



efc 



< 



r(")-l 

E 



< 



A:=0 



n-l 

E 



efc 



{k-iy 
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It is a classical result (used also for the total length of Kingman coalescent) that 

n-l 

ei. p 

> 1, 



1 \ ^ efc 
)K n f--^ (k — 1 

which implies that 



^ogn ^ (/c - 1) 



r(")-l 

T n^"^^X-^ri = Op(logn). 

This completes the proof. □ 



Lemma 2.6. As $n$ goes to $\infty$,

\[ \tilde I^{(n)} - \hat I^{(n)} = O_p(\sqrt{n}). \]

Proof. We proceed similar as in |14j . Recall that is the number of external branches 
after k coalescing events. Since Y"^^"^ = xj^^ — 

Jin) _ Jin) ^ _ ^f-^ 4")-E[4-)|XH] 
k=l 

Also recall that Z^"'^ — zj^-^ has a conditional hypergeometric distribution, given 
Therefore 

^(n) (n) _ , 

^(n) _ ^(n) _ /-^W _|_ fc-1 _ ^(n) _ ^(n) ^fc ^ _ jjin) 

k k—1 y k ' ~yin) fc— 1 -i^(n) ' 

where denotes a random variable with conditional hypergeometric distribution with 

parameters ^fc"'*! and 1 + C/^"^ as above, centered at its (conditional) expectation. For 

^(n) ^^(n)_jg^^(n)|^(„)j 



it follows 



v-in) _ 1 



(n) 

Iterating this linear recursion we obtain because of D^q'' = 



P,(n) fc rr(n) fe 

T^(n) Z-^ -^(n) 11 T^in) 

i=l "^j i=j+l 



and consequently 

jin) _ fin) ^ X^T^fc) n Yl T;fcy X] n {^~^An))- 

k=l j=l i=j+l j=l k=j i=j+l 

Now, since the //f ^ are centered hypergeometric variables, they are uncorrelated, given X^"' . 
Also from the formula for the variance of a hypergeometric distribution 

^in) 

E[(i7]"))V("\^j-\]<(^f^ + l)^ 
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thus, since 'z!'^\ < -'^j-i a.s. 

E[(Fj"V|X(")] < {Uf^ + 1). 
Putting everything together we obtain 

E[(/W_/("))2|xW] < 



,=1 {^f^Y 



E n (i-^ 



The product can be estimated by 1, thus 
By means of r^*^) — j < -''^o-"'' 



(")\2 



r(")-l 



- /(n))2|j^(n)] < ^ (^W + 1) < n + r„ < 2n. 

Now an application of Chebyshev's inequality gives the claim. □

3. Application to population genetics

Let us now suppose that mutations occur along genealogical trees according to a Poisson
process of intensity $\mu$. We denote by $M^{(n)}$ the total number of mutations in the Bolthausen-
Sznitman $n$-coalescent. The Poissonian representation implies that, conditionally on $L^{(n)}$,
$M^{(n)}$ is distributed as a Poisson r.v. with parameter $\mu L^{(n)}$. Mutations can be divided into
external and internal ones according to the type of the branches where they appear, and we denote
their numbers by $M_E^{(n)}$ and $M_I^{(n)}$, respectively.
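The Poissonian representation can be illustrated as follows. Given an internal length and an external length (for instance produced by a tree simulation), the mutation counts are independent Poisson variables with means $\mu$ times the respective lengths. The sketch below samples a Poisson count by counting exponential inter-arrival times, which needs only the standard library; the names `poisson_count` and `mutation_counts` are hypothetical.

```python
import random


def poisson_count(rate: float, length: float, rng: random.Random) -> int:
    """Number of points of a rate-`rate` Poisson process on [0, length]."""
    if rate <= 0.0 or length <= 0.0:
        return 0
    count = 0
    position = rng.expovariate(rate)
    while position <= length:
        count += 1
        position += rng.expovariate(rate)
    return count


def mutation_counts(
    mu: float, internal_len: float, external_len: float, rng: random.Random
) -> tuple[int, int, int]:
    """Return (internal, external, total) mutation counts."""
    m_int = poisson_count(mu, internal_len, rng)
    m_ext = poisson_count(mu, external_len, rng)
    return m_int, m_ext, m_int + m_ext


if __name__ == "__main__":
    rng = random.Random(13)
    # Illustrative lengths only; in practice they would come from a simulated tree.
    print(mutation_counts(mu=0.5, internal_len=40.0, external_len=300.0, rng=rng))
```

The cost of `poisson_count` is proportional to the number of mutations, which is adequate for illustration; for very long branches a dedicated Poisson sampler would be preferable.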



Proposition 3.1. As $n$ goes to $\infty$,

\[ \frac{(\log n)^2}{n}\,M_I^{(n)} \to \mu \]

in probability and

\[ \frac{(\log n)^2}{n}\,M_E^{(n)} - \mu\log n - \mu\log\log n \to \mu(Z - 1) \]

in distribution.



Proof. Let $N = (N_t, t \ge 0)$ be a Poisson process with parameter $\mu$. We first note that,
conditionally on $I^{(n)}$, $M_I^{(n)}$ has the same distribution as $N_{I^{(n)}}$. This implies



E 



M 



(n) 



E 



E 



Since /^"^ — t- oo a.s., thanks Theorem 1.1, we deduce that Nj(n)/I^'^^ — t- /i in probability and 



M 



in) 



d Aj(„) 



E 



M 



(n) 



E 



j(n) 



thanks to Theorem 1.1 
^n/(logn)2, as n — )• cxD. 



Therefore, the first result follows from E[Mj 



(")i 



,E[/(") 
To get the second part of this proposition, we just need to observe that $M^{(n)} = M_I^{(n)} + M_E^{(n)}$
satisfies (see Corollary 6.2 of [7])

\[ \frac{(\log n)^2}{n}\,M^{(n)} - \mu\log n - \mu\log\log n \to \mu Z \]

in distribution. □